The Role of Quasi-identifiers in k-Anonymity Revisited
Abstract
The concept of k-anonymity, used in the recent literature (e.g., [10, 11, 7, 5, 1]) to formally evaluate the privacy preservation of published tables, was introduced in the seminal papers of Samarati and Sweeney [10, 11] based on the notion of quasi-identifiers (QI for short). Obtaining k-anonymity for a given private table takes two steps: first recognize the QIs in the table, and then anonymize the QI values, the latter step being called k-anonymization. While k-anonymization is usually rigorously validated by the authors, the definition of QI remains mostly informal, and different authors seem to interpret the concept differently. The purpose of this short note is to provide a formal underpinning of QI and to examine the correctness and incorrectness of various interpretations of QI in our formal framework. We observe that in cases where the concept has been used correctly, its application has been conservative; this note provides a formal understanding of the conservative nature in such cases.
The notion of QI was perhaps first introduced by Dalenius in [3] to denote a set of attribute values in census records that may be used to re-identify a single individual or a group of individuals. To Dalenius, the case of multiple individuals being identified is potentially dangerous because of collusion. In [10, 11], the notion of QI is extended to a set of attributes whose (combined) values may be used to re-identify the individuals of the released information by using “external” sources. Hence, the appearance of QI attribute values in a published database
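As an illustration of the two-step process described in the abstract, the following minimal sketch checks a table against the common reading of k-anonymity: every combination of QI values that appears in the table must appear in at least k rows. The function name `is_k_anonymous`, the column names, and the sample rows are hypothetical and are not taken from the paper, whose formal framework may define QI differently.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every combination of QI values occurs in at least k rows.

    rows: list of dicts, one per published record (hypothetical layout).
    quasi_identifiers: attribute names assumed to form the QI.
    """
    # Count how many rows share each combination of QI values.
    groups = Counter(tuple(row[attr] for attr in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())


# Hypothetical published table: "zip" and "age" are already generalized,
# "disease" stands in for the sensitive attribute.
table = [
    {"zip": "130**", "age": "20-29", "disease": "flu"},
    {"zip": "130**", "age": "20-29", "disease": "asthma"},
    {"zip": "148**", "age": "30-39", "disease": "flu"},
    {"zip": "148**", "age": "30-39", "disease": "diabetes"},
]

print(is_k_anonymous(table, ["zip", "age"], 2))
print(is_k_anonymous(table, ["zip", "age"], 3))
```

The result depends entirely on which attributes are declared to be the QI, which is why an informal or inconsistent definition of QI matters: the same table can pass the check under one choice of attributes and fail under another.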
Similar resources
Anonymity: Formalisation of Privacy – k-anonymity
Microdata is the basis of statistical studies. If microdata is released, it can leak sensitive information about the participants, even if identifiers like name or social security number are removed. A proper anonymization for statistical microdata is essential. K-anonymity has been intensively discussed as a measure for anonymity in statistical data. Quasi identifiers are attributes that might...
Butterfly: Privacy Preserving Publishing on Multiple Quasi-Identifiers
Recently, privacy preserving data publishing has attracted significant research interest. Most existing studies focus only on situations where the data in question is published using a single quasi-identifier. However, in a few important applications, a practical demand is to publish a data set on multiple quasi-identifiers for multiple users simultaneously, which poses several challen...
Privacy Issues for K-anonymity Model
K-anonymity is the approach used for preventing identity disclosure. Identity disclosure means that an individual is linked to a particular record in the published data and the individual’s sensitive data is accessed. Some important information, such as name, income details, medical status, and property details, is considered sensitive data (or attributes) because these data have to be kept secure fr...
A Rough Set Based Efficient l-diversity Algorithm
Most organizations publish microdata for a variety of purposes, including demographic and public health research. To protect the anonymity of the entities, data holders often remove or encrypt explicit identifiers. But released information often contains quasi-identifiers, which leak valuable information. Samarati and Sweeney introduced the concept of k-anonymity to handle this problem ...
kACTUS 2: Privacy Preserving in Classification Tasks Using k-Anonymity
k-anonymity is the method used for masking sensitive data which successfully solves the problem of re-linking of data with an external source and makes it difficult to re-identify the individual. Thus k-anonymity works on a set of quasi-identifiers (public sensitive attributes), whose possible availability and linking is anticipated from external dataset, and demands that the released da...
Journal title:
- CoRR
Volume abs/cs/0611035
Publication date 2006